Dataset statistics
| Number of variables | 13 |
|---|---|
| Number of observations | 6000 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 19 |
| Duplicate rows (%) | 0.3% |
| Total size in memory | 609.5 KiB |
| Average record size in memory | 104.0 B |
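The percentage and per-record figures in this table follow from the raw counts. A minimal sketch of that arithmetic, using only values reported above; the 8-bytes-per-column reading is an assumption, not something the report states:

```python
# Values copied from the "Dataset statistics" table.
n_rows = 6000
n_cols = 13
duplicate_rows = 19
total_kib = 609.5

# Share of rows flagged as duplicates.
duplicate_pct = 100 * duplicate_rows / n_rows

# Average record size: total bytes divided by the number of rows.
total_bytes = total_kib * 1024
avg_record_bytes = total_bytes / n_rows

# Assumption: every column costs 8 bytes per row (a 64-bit number or an
# object pointer), which would give 13 * 8 = 104 bytes per record.
assumed_bytes_per_row = n_cols * 8

print(f"{duplicate_pct:.1f}%")      # 0.3%
print(f"{avg_record_bytes:.1f} B")  # 104.0 B
print(assumed_bytes_per_row)        # 104
```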
Variable types
| Categorical | 1 |
|---|---|
| Numeric | 12 |
Dataset has 19 (0.3%) duplicate rows | Duplicates |
B365H is highly correlated with B365A and 6 other fields | High correlation |
B365D is highly correlated with IWD and 2 other fields | High correlation |
B365A is highly correlated with B365H and 6 other fields | High correlation |
IWH is highly correlated with B365H and 6 other fields | High correlation |
IWD is highly correlated with B365D and 2 other fields | High correlation |
IWA is highly correlated with B365H and 6 other fields | High correlation |
LBH is highly correlated with B365H and 6 other fields | High correlation |
LBD is highly correlated with B365D and 2 other fields | High correlation |
LBA is highly correlated with B365H and 6 other fields | High correlation |
WHH is highly correlated with B365H and 6 other fields | High correlation |
WHD is highly correlated with B365D and 2 other fields | High correlation |
WHA is highly correlated with B365H and 6 other fields | High correlation |
B365H is highly correlated with B365A and 6 other fields | High correlation |
B365D is highly correlated with B365A and 6 other fields | High correlation |
B365A is highly correlated with B365H and 10 other fields | High correlation |
IWH is highly correlated with B365H and 6 other fields | High correlation |
IWD is highly correlated with B365D and 6 other fields | High correlation |
IWA is highly correlated with B365H and 10 other fields | High correlation |
LBH is highly correlated with B365H and 6 other fields | High correlation |
LBD is highly correlated with B365D and 6 other fields | High correlation |
LBA is highly correlated with B365H and 10 other fields | High correlation |
WHH is highly correlated with B365H and 6 other fields | High correlation |
WHD is highly correlated with B365D and 6 other fields | High correlation |
WHA is highly correlated with B365H and 10 other fields | High correlation |
B365H is highly correlated with IWH and 2 other fields | High correlation |
B365D is highly correlated with B365A and 7 other fields | High correlation |
B365A is highly correlated with B365D and 6 other fields | High correlation |
IWH is highly correlated with B365H and 3 other fields | High correlation |
IWD is highly correlated with B365D and 7 other fields | High correlation |
IWA is highly correlated with B365D and 6 other fields | High correlation |
LBH is highly correlated with B365H and 2 other fields | High correlation |
LBD is highly correlated with B365D and 6 other fields | High correlation |
LBA is highly correlated with B365D and 6 other fields | High correlation |
WHH is highly correlated with B365H and 3 other fields | High correlation |
WHD is highly correlated with B365D and 6 other fields | High correlation |
WHA is highly correlated with B365D and 6 other fields | High correlation |
Reproduction
| Analysis started | 2021-12-27 01:00:22.703668 |
|---|---|
| Analysis finished | 2021-12-27 01:00:45.753665 |
| Duration | 23.05 seconds |
| Software version | pandas-profiling v3.1.0 |
FTR
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 47.0 KiB |
Length
| Max length | 2 |
|---|---|
| Median length | 2 |
| Mean length | 1.5355 |
| Min length | 1 |
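The length statistics can be reproduced from the two FTR categories and their counts (NH: 3213, H: 2787, as reported in this section). A small sketch, assuming length simply means the number of characters in each label:

```python
from statistics import median

# Value counts for FTR, copied from the report.
counts = {"NH": 3213, "H": 2787}

# One length per observation: "NH" has length 2, "H" has length 1.
lengths = [len(value) for value, count in counts.items() for _ in range(count)]

mean_length = sum(lengths) / len(lengths)
median_length = median(lengths)

print(round(mean_length, 4))       # 1.5355
print(median_length)               # 2.0
print(min(lengths), max(lengths))  # 1 2
```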
Characters and Unicode
| Total characters | 0 |
|---|---|
| Distinct characters | 0 |
| Distinct categories | 0 |
| Distinct scripts | 0 |
| Distinct blocks | 0 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | NH |
|---|---|
| 2nd row | NH |
| 3rd row | NH |
| 4th row | H |
| 5th row | H |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| NH | 3213 | 53.55% |
| H | 2787 | 46.45% |
Length
Histogram of lengths of the category
Pie chart
B365H
Numeric
| Distinct | 144 |
|---|---|
| Distinct (%) | 2.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2.701662333 |
| Minimum | 1.08 |
| Maximum | 23 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 47.0 KiB |
Quantile statistics
| Minimum | 1.08 |
|---|---|
| 5-th percentile | 1.25 |
| Q1 | 1.67 |
| median | 2.15 |
| Q3 | 2.87 |
| 95-th percentile | 6.5 |
| Maximum | 23 |
| Range | 21.92 |
| Interquartile range (IQR) | 1.2 |
Descriptive statistics
| Standard deviation | 1.796118455 |
|---|---|
| Coefficient of variation (CV) | 0.6648197418 |
| Kurtosis | 11.51518168 |
| Mean | 2.701662333 |
| Median Absolute Deviation (MAD) | 0.54 |
| Skewness | 2.794196935 |
| Sum | 16209.974 |
| Variance | 3.226041504 |
| Monotonicity | Not monotonic |
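Several of the figures above are derived from one another, which gives a quick way to sanity-check the report. The sketch below recomputes them from the reported mean, standard deviation, extremes and quartiles; it only checks internal consistency and does not recompute anything from the raw data:

```python
import math

# Values copied from the statistics tables above.
n = 6000
mean = 2.701662333
std = 1.796118455
minimum, maximum = 1.08, 23
q1, q3 = 1.67, 2.87

variance = std ** 2              # reported Variance: 3.226041504
cv = std / mean                  # reported CV: 0.6648197418
value_range = maximum - minimum  # reported Range: 21.92
iqr = q3 - q1                    # reported IQR: 1.2
total = mean * n                 # reported Sum: 16209.974

# Compare each derived value with the figure printed in the report.
checks = {
    "variance": (variance, 3.226041504),
    "cv": (cv, 0.6648197418),
    "range": (value_range, 21.92),
    "iqr": (iqr, 1.2),
    "sum": (total, 16209.974),
}
for name, (derived, reported) in checks.items():
    print(name, math.isclose(derived, reported, rel_tol=1e-6))
```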
Histogram with fixed size bins (bins=50)
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 2.1 | 280 | 4.7% |
| 2.5 | 241 | 4.0% |
| 2 | 213 | 3.5% |
| 2.2 | 190 | 3.2% |
| 2.25 | 167 | 2.8% |
| 2.3 | 158 | 2.6% |
| 1.8 | 154 | 2.6% |
| 1.83 | 149 | 2.5% |
| 2.4 | 131 | 2.2% |
| 1.25 | 117 | 1.9% |
| Other values (134) | 4200 |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 1.08 | 3 | 0.1% |
| 1.09 | 1 | < 0.1% |
| 1.1 | 6 | 0.1% |
| 1.11 | 8 | 0.1% |
| 1.12 | 7 | 0.1% |
| 1.13 | 8 | 0.1% |
| 1.14 | 29 | |
| 1.143 | 2 | < 0.1% |
| 1.16 | 17 | |
| 1.167 | 2 | < 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 23 | 1 | < 0.1% |
| 17 | 3 | 0.1% |
| 15 | 4 | 0.1% |
| 14 | 1 | < 0.1% |
| 13 | 8 | 0.1% |
| 12 | 11 | 0.2% |
| 11 | 15 | |
| 10 | 19 | |
| 9.5 | 8 | 0.1% |
| 9 | 36 |
B365D
Numeric
| Distinct | 50 |
|---|---|
| Distinct (%) | 0.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3.890617 |
| Minimum | 2.5 |
| Maximum | 13 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 47.0 KiB |
Quantile statistics
| Minimum | 2.5 |
|---|---|
| 5-th percentile | 3.2 |
| Q1 | 3.25 |
| median | 3.5 |
| Q3 | 4 |
| 95-th percentile | 6 |
| Maximum | 13 |
| Range | 10.5 |
| Interquartile range (IQR) | 0.75 |
Descriptive statistics
| Standard deviation | 1.067555647 |
|---|---|
| Coefficient of variation (CV) | 0.2743923771 |
| Kurtosis | 11.74352662 |
| Mean | 3.890617 |
| Median Absolute Deviation (MAD) | 0.25 |
| Skewness | 2.927512835 |
| Sum | 23343.702 |
| Variance | 1.139675059 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 3.25 | 847 | |
| 3.4 | 717 | |
| 3.3 | 558 | 9.3% |
| 3.2 | 528 | 8.8% |
| 3.5 | 496 | 8.3% |
| 3.6 | 491 | 8.2% |
| 4 | 290 | 4.8% |
| 3.75 | 249 | 4.2% |
| 4.5 | 177 | 2.9% |
| 5 | 167 | 2.8% |
| Other values (40) | 1480 |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 2.5 | 1 | < 0.1% |
| 2.875 | 1 | < 0.1% |
| 3 | 30 | 0.5% |
| 3.1 | 131 | 2.2% |
| 3.2 | 528 | |
| 3.25 | 847 | |
| 3.29 | 20 | 0.3% |
| 3.3 | 558 | |
| 3.39 | 27 | 0.4% |
| 3.4 | 717 |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 13 | 4 | 0.1% |
| 12 | 2 | < 0.1% |
| 11 | 8 | 0.1% |
| 10 | 6 | 0.1% |
| 9.5 | 3 | 0.1% |
| 9 | 16 | 0.3% |
| 8.5 | 15 | 0.2% |
| 8 | 31 | |
| 7.5 | 29 | |
| 7 | 65 |
B365A
Numeric
| Distinct | 133 |
|---|---|
| Distinct (%) | 2.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 4.828635333 |
| Minimum | 1.16 |
| Maximum | 34 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 47.0 KiB |
Quantile statistics
| Minimum | 1.16 |
|---|---|
| 5-th percentile | 1.55 |
| Q1 | 2.5 |
| median | 3.5 |
| Q3 | 5.5 |
| 95-th percentile | 13 |
| Maximum | 34 |
| Range | 32.84 |
| Interquartile range (IQR) | 3 |
Descriptive statistics
| Standard deviation | 3.939662179 |
|---|---|
| Coefficient of variation (CV) | 0.8158955703 |
| Kurtosis | 6.950641847 |
| Mean | 4.828635333 |
| Median Absolute Deviation (MAD) | 1.25 |
| Skewness | 2.366499065 |
| Sum | 28971.812 |
| Variance | 15.52093809 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 4.5 | 198 | 3.3% |
| 3 | 196 | 3.3% |
| 4 | 195 | 3.2% |
| 3.4 | 177 | 2.9% |
| 5 | 165 | 2.8% |
| 3.75 | 158 | 2.6% |
| 3.5 | 143 | 2.4% |
| 5.5 | 142 | 2.4% |
| 2.5 | 132 | 2.2% |
| 6 | 130 | 2.2% |
| Other values (123) | 4364 |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 1.16 | 2 | < 0.1% |
| 1.18 | 1 | < 0.1% |
| 1.2 | 1 | < 0.1% |
| 1.22 | 4 | 0.1% |
| 1.25 | 6 | 0.1% |
| 1.28 | 7 | 0.1% |
| 1.29 | 3 | 0.1% |
| 1.3 | 15 | |
| 1.33 | 21 | |
| 1.36 | 31 |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 34 | 3 | 0.1% |
| 29 | 3 | 0.1% |
| 27 | 1 | < 0.1% |
| 26 | 13 | 0.2% |
| 23 | 14 | 0.2% |
| 21 | 18 | 0.3% |
| 19 | 45 | |
| 18 | 3 | 0.1% |
| 17 | 55 | |
| 16 | 7 | 0.1% |
IWH
Numeric
| Distinct | 122 |
|---|---|
| Distinct (%) | 2.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2.507676667 |
| Minimum | 1.05 |
| Maximum | 15 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 47.0 KiB |
Quantile statistics
| Minimum | 1.05 |
|---|---|
| 5-th percentile | 1.25 |
| Q1 | 1.65 |
| median | 2.1 |
| Q3 | 2.65 |
| 95-th percentile | 5.5 |
| Maximum | 15 |
| Range | 13.95 |
| Interquartile range (IQR) | 1 |
Descriptive statistics
| Standard deviation | 1.411207563 |
|---|---|
| Coefficient of variation (CV) | 0.5627549923 |
| Kurtosis | 7.648895308 |
| Mean | 2.507676667 |
| Median Absolute Deviation (MAD) | 0.5 |
| Skewness | 2.377039201 |
| Sum | 15046.06 |
| Variance | 1.991506787 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 2.2 | 348 | 5.8% |
| 2.1 | 333 | 5.5% |
| 2 | 299 | 5.0% |
| 2.3 | 269 | 4.5% |
| 1.9 | 267 | 4.5% |
| 2.4 | 266 | 4.4% |
| 1.8 | 226 | 3.8% |
| 2.5 | 199 | 3.3% |
| 1.75 | 152 | 2.5% |
| 1.85 | 152 | 2.5% |
| Other values (112) | 3489 |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 1.05 | 1 | < 0.1% |
| 1.08 | 1 | < 0.1% |
| 1.1 | 4 | 0.1% |
| 1.12 | 26 | 0.4% |
| 1.15 | 40 | 0.7% |
| 1.17 | 42 | |
| 1.18 | 2 | < 0.1% |
| 1.2 | 102 | |
| 1.22 | 63 | |
| 1.23 | 2 | < 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 15 | 2 | < 0.1% |
| 11 | 8 | |
| 10 | 4 | 0.1% |
| 9.6 | 1 | < 0.1% |
| 9.5 | 10 | |
| 9 | 6 | |
| 8.7 | 1 | < 0.1% |
| 8.5 | 11 | |
| 8 | 13 | |
| 7.8 | 1 | < 0.1% |
IWD
Numeric
| Distinct | 62 |
|---|---|
| Distinct (%) | 1.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3.6449 |
| Minimum | 2.5 |
| Maximum | 10.5 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 47.0 KiB |
Quantile statistics
| Minimum | 2.5 |
|---|---|
| 5-th percentile | 3 |
| Q1 | 3.2 |
| median | 3.3 |
| Q3 | 3.8 |
| 95-th percentile | 5.5 |
| Maximum | 10.5 |
| Range | 8 |
| Interquartile range (IQR) | 0.6 |
Descriptive statistics
| Standard deviation | 0.8145256689 |
|---|---|
| Coefficient of variation (CV) | 0.2234699632 |
| Kurtosis | 9.546301073 |
| Mean | 3.6449 |
| Median Absolute Deviation (MAD) | 0.2 |
| Skewness | 2.683333596 |
| Sum | 21869.4 |
| Variance | 0.6634520653 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 3.2 | 958 | |
| 3.3 | 935 | |
| 3.1 | 656 | |
| 3 | 447 | 7.4% |
| 3.5 | 380 | 6.3% |
| 3.4 | 354 | 5.9% |
| 3.6 | 242 | 4.0% |
| 4 | 227 | 3.8% |
| 3.7 | 198 | 3.3% |
| 3.9 | 153 | 2.5% |
| Other values (52) | 1450 |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 2.5 | 1 | < 0.1% |
| 2.6 | 1 | < 0.1% |
| 2.7 | 2 | < 0.1% |
| 2.8 | 2 | < 0.1% |
| 2.9 | 31 | 0.5% |
| 2.95 | 1 | < 0.1% |
| 3 | 447 | |
| 3.05 | 6 | 0.1% |
| 3.1 | 656 | |
| 3.15 | 18 | 0.3% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 10.5 | 1 | < 0.1% |
| 10 | 1 | < 0.1% |
| 9.2 | 1 | < 0.1% |
| 9.1 | 1 | < 0.1% |
| 9 | 12 | |
| 8 | 6 | 0.1% |
| 7.9 | 1 | < 0.1% |
| 7.5 | 17 | |
| 7.3 | 1 | < 0.1% |
| 7 | 17 |
IWA
Numeric
| Distinct | 135 |
|---|---|
| Distinct (%) | 2.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 4.185253333 |
| Minimum | 1.2 |
| Maximum | 29 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 47.0 KiB |
Quantile statistics
| Minimum | 1.2 |
|---|---|
| 5-th percentile | 1.55 |
| Q1 | 2.5 |
| median | 3.2 |
| Q3 | 4.6 |
| 95-th percentile | 10.31 |
| Maximum | 29 |
| Range | 27.8 |
| Interquartile range (IQR) | 2.1 |
Descriptive statistics
| Standard deviation | 2.934304982 |
|---|---|
| Coefficient of variation (CV) | 0.7011057033 |
| Kurtosis | 6.978977883 |
| Mean | 4.185253333 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 2.28384784 |
| Sum | 25111.52 |
| Variance | 8.610145727 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 3.8 | 226 | 3.8% |
| 3.2 | 207 | 3.5% |
| 2.7 | 202 | 3.4% |
| 3.3 | 196 | 3.3% |
| 2.9 | 175 | 2.9% |
| 3.6 | 174 | 2.9% |
| 3.1 | 171 | 2.9% |
| 2.6 | 165 | 2.8% |
| 2.5 | 151 | 2.5% |
| 4 | 147 | 2.5% |
| Other values (125) | 4186 |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 1.2 | 4 | 0.1% |
| 1.22 | 1 | < 0.1% |
| 1.25 | 6 | 0.1% |
| 1.27 | 4 | 0.1% |
| 1.28 | 1 | < 0.1% |
| 1.3 | 19 | 0.3% |
| 1.33 | 6 | 0.1% |
| 1.35 | 28 | |
| 1.37 | 4 | 0.1% |
| 1.4 | 50 |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 29 | 1 | < 0.1% |
| 28 | 1 | < 0.1% |
| 27 | 1 | < 0.1% |
| 25 | 1 | < 0.1% |
| 22 | 1 | < 0.1% |
| 20 | 12 | 0.2% |
| 18 | 10 | 0.2% |
| 17 | 1 | < 0.1% |
| 16 | 5 | 0.1% |
| 15 | 44 |
LBH
Numeric
| Distinct | 226 |
|---|---|
| Distinct (%) | 3.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2.590269167 |
| Minimum | 1.08 |
| Maximum | 21.06 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 47.0 KiB |
Quantile statistics
| Minimum | 1.08 |
|---|---|
| 5-th percentile | 1.22 |
| Q1 | 1.66 |
| median | 2.1 |
| Q3 | 2.75 |
| 95-th percentile | 6 |
| Maximum | 21.06 |
| Range | 19.98 |
| Interquartile range (IQR) | 1.09 |
Descriptive statistics
| Standard deviation | 1.610506916 |
|---|---|
| Coefficient of variation (CV) | 0.6217527261 |
| Kurtosis | 10.77229002 |
| Mean | 2.590269167 |
| Median Absolute Deviation (MAD) | 0.5 |
| Skewness | 2.664595455 |
| Sum | 15541.615 |
| Variance | 2.593732525 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 2.1 | 277 | 4.6% |
| 2.2 | 272 | 4.5% |
| 2 | 214 | 3.6% |
| 2.25 | 197 | 3.3% |
| 1.8 | 195 | 3.2% |
| 2.5 | 184 | 3.1% |
| 1.73 | 145 | 2.4% |
| 2.38 | 145 | 2.4% |
| 1.91 | 144 | 2.4% |
| 2.4 | 126 | 2.1% |
| Other values (216) | 4101 |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 1.08 | 3 | 0.1% |
| 1.09 | 2 | < 0.1% |
| 1.1 | 7 | 0.1% |
| 1.11 | 10 | |
| 1.12 | 12 | |
| 1.13 | 4 | 0.1% |
| 1.14 | 21 | |
| 1.15 | 7 | 0.1% |
| 1.16 | 12 | |
| 1.167 | 3 | 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 21.06 | 1 | < 0.1% |
| 17 | 1 | < 0.1% |
| 15 | 1 | < 0.1% |
| 13 | 4 | |
| 12.46 | 1 | < 0.1% |
| 12.1 | 1 | < 0.1% |
| 12 | 7 | |
| 11.49 | 1 | < 0.1% |
| 11.09 | 1 | < 0.1% |
| 11 | 6 |
LBD
Numeric
| Distinct | 162 |
|---|---|
| Distinct (%) | 2.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3.747798833 |
| Minimum | 2.75 |
| Maximum | 12.59 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 47.0 KiB |
Quantile statistics
| Minimum | 2.75 |
|---|---|
| 5-th percentile | 3.2 |
| Q1 | 3.25 |
| median | 3.4 |
| Q3 | 3.75 |
| 95-th percentile | 5.75 |
| Maximum | 12.59 |
| Range | 9.84 |
| Interquartile range (IQR) | 0.5 |
Descriptive statistics
| Standard deviation | 0.9293678598 |
|---|---|
| Coefficient of variation (CV) | 0.2479769863 |
| Kurtosis | 14.24326278 |
| Mean | 3.747798833 |
| Median Absolute Deviation (MAD) | 0.2 |
| Skewness | 3.1442089 |
| Sum | 22486.793 |
| Variance | 0.8637246188 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 3.2 | 1191 | |
| 3.4 | 629 | |
| 3.25 | 610 | |
| 3.3 | 597 | |
| 3.5 | 576 | |
| 3.75 | 347 | 5.8% |
| 3.6 | 269 | 4.5% |
| 4 | 245 | 4.1% |
| 4.5 | 188 | 3.1% |
| 5 | 174 | 2.9% |
| Other values (152) | 1174 |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 2.75 | 1 | < 0.1% |
| 2.8 | 1 | < 0.1% |
| 2.875 | 1 | < 0.1% |
| 2.88 | 5 | 0.1% |
| 2.9 | 10 | 0.2% |
| 3 | 100 | |
| 3.04 | 1 | < 0.1% |
| 3.05 | 1 | < 0.1% |
| 3.06 | 1 | < 0.1% |
| 3.08 | 1 | < 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 12.59 | 1 | < 0.1% |
| 12.39 | 1 | < 0.1% |
| 12 | 1 | < 0.1% |
| 11.96 | 1 | < 0.1% |
| 11 | 3 | |
| 10.98 | 1 | < 0.1% |
| 10.9 | 1 | < 0.1% |
| 10.86 | 1 | < 0.1% |
| 10.68 | 1 | < 0.1% |
| 10.57 | 1 | < 0.1% |
LBA
Numeric
| Distinct | 245 |
|---|---|
| Distinct (%) | 4.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 4.498023333 |
| Minimum | 1.17 |
| Maximum | 32.7 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 47.0 KiB |
Quantile statistics
| Minimum | 1.17 |
|---|---|
| 5-th percentile | 1.53 |
| Q1 | 2.45 |
| median | 3.3 |
| Q3 | 5 |
| 95-th percentile | 12 |
| Maximum | 32.7 |
| Range | 31.53 |
| Interquartile range (IQR) | 2.55 |
Descriptive statistics
| Standard deviation | 3.472833699 |
|---|---|
| Coefficient of variation (CV) | 0.7720799652 |
| Kurtosis | 8.47182719 |
| Mean | 4.498023333 |
| Median Absolute Deviation (MAD) | 1.2 |
| Skewness | 2.462405804 |
| Sum | 26988.14 |
| Variance | 12.0605739 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 3 | 279 | 4.7% |
| 4.5 | 239 | 4.0% |
| 4 | 209 | 3.5% |
| 3.75 | 192 | 3.2% |
| 2.75 | 190 | 3.2% |
| 5 | 187 | 3.1% |
| 3.2 | 178 | 3.0% |
| 3.5 | 165 | 2.8% |
| 2.5 | 162 | 2.7% |
| 2.8 | 159 | 2.6% |
| Other values (235) | 4040 |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 1.17 | 1 | < 0.1% |
| 1.18 | 2 | < 0.1% |
| 1.2 | 2 | < 0.1% |
| 1.22 | 5 | |
| 1.25 | 5 | |
| 1.27 | 1 | < 0.1% |
| 1.28 | 5 | |
| 1.29 | 10 | |
| 1.3 | 9 | |
| 1.31 | 1 | < 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 32.7 | 2 | < 0.1% |
| 29 | 3 | |
| 28.68 | 1 | < 0.1% |
| 27.95 | 1 | < 0.1% |
| 27.28 | 1 | < 0.1% |
| 26 | 6 | |
| 24.79 | 1 | < 0.1% |
| 24.65 | 1 | < 0.1% |
| 23.72 | 1 | < 0.1% |
| 23 | 5 |
WHH
Numeric
| Distinct | 107 |
|---|---|
| Distinct (%) | 1.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2.63024 |
| Minimum | 1.06 |
| Maximum | 17 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 47.0 KiB |
Quantile statistics
| Minimum | 1.06 |
|---|---|
| 5-th percentile | 1.22 |
| Q1 | 1.66 |
| median | 2.15 |
| Q3 | 2.75 |
| 95-th percentile | 6 |
| Maximum | 17 |
| Range | 15.94 |
| Interquartile range (IQR) | 1.09 |
Descriptive statistics
| Standard deviation | 1.647163127 |
|---|---|
| Coefficient of variation (CV) | 0.6262406195 |
| Kurtosis | 8.588908896 |
| Mean | 2.63024 |
| Median Absolute Deviation (MAD) | 0.54 |
| Skewness | 2.523792581 |
| Sum | 15781.44 |
| Variance | 2.713146367 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 2.5 | 228 | 3.8% |
| 2.1 | 201 | 3.4% |
| 2.3 | 195 | 3.2% |
| 2.25 | 171 | 2.9% |
| 2 | 167 | 2.8% |
| 2.4 | 166 | 2.8% |
| 2.2 | 163 | 2.7% |
| 1.8 | 149 | 2.5% |
| 2.15 | 146 | 2.4% |
| 1.83 | 140 | 2.3% |
| Other values (97) | 4274 |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 1.06 | 2 | < 0.1% |
| 1.07 | 2 | < 0.1% |
| 1.08 | 2 | < 0.1% |
| 1.1 | 11 | 0.2% |
| 1.11 | 6 | 0.1% |
| 1.12 | 13 | 0.2% |
| 1.14 | 35 | |
| 1.15 | 4 | 0.1% |
| 1.16 | 18 | |
| 1.17 | 41 |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 17 | 2 | < 0.1% |
| 15 | 2 | < 0.1% |
| 13 | 3 | 0.1% |
| 12 | 5 | 0.1% |
| 11 | 14 | |
| 10.5 | 1 | < 0.1% |
| 10 | 14 | |
| 9.5 | 10 | |
| 9 | 21 | |
| 8.5 | 14 |
WHD
Numeric
| Distinct | 42 |
|---|---|
| Distinct (%) | 0.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3.649625 |
| Minimum | 2.8 |
| Maximum | 13 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 47.0 KiB |
Quantile statistics
| Minimum | 2.8 |
|---|---|
| 5-th percentile | 3 |
| Q1 | 3.2 |
| median | 3.3 |
| Q3 | 3.75 |
| 95-th percentile | 5.5 |
| Maximum | 13 |
| Range | 10.2 |
| Interquartile range (IQR) | 0.55 |
Descriptive statistics
| Standard deviation | 0.8885036199 |
|---|---|
| Coefficient of variation (CV) | 0.2434506613 |
| Kurtosis | 14.56153094 |
| Mean | 3.649625 |
| Median Absolute Deviation (MAD) | 0.2 |
| Skewness | 3.133558689 |
| Sum | 21897.75 |
| Variance | 0.7894386825 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=42)
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 3.2 | 990 | |
| 3.1 | 980 | |
| 3.3 | 754 | |
| 3.4 | 438 | 7.3% |
| 3.5 | 339 | 5.7% |
| 3.6 | 276 | 4.6% |
| 3.25 | 275 | 4.6% |
| 3 | 271 | 4.5% |
| 4 | 223 | 3.7% |
| 3.75 | 207 | 3.5% |
| Other values (32) | 1247 |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 2.8 | 7 | 0.1% |
| 2.87 | 15 | 0.2% |
| 2.88 | 5 | 0.1% |
| 2.9 | 43 | 0.7% |
| 3 | 271 | 4.5% |
| 3.1 | 980 | |
| 3.2 | 990 | |
| 3.25 | 275 | 4.6% |
| 3.3 | 754 | |
| 3.4 | 438 |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 13 | 1 | < 0.1% |
| 12 | 2 | < 0.1% |
| 11 | 3 | 0.1% |
| 10 | 5 | 0.1% |
| 9.5 | 1 | < 0.1% |
| 9 | 9 | 0.1% |
| 8.5 | 4 | 0.1% |
| 8 | 11 | 0.2% |
| 7.5 | 17 | |
| 7 | 35 |
WHA
Numeric
| Distinct | 102 |
|---|---|
| Distinct (%) | 1.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 4.548745 |
| Minimum | 1.14 |
| Maximum | 29 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 47.0 KiB |
Quantile statistics
| Minimum | 1.14 |
|---|---|
| 5-th percentile | 1.53 |
| Q1 | 2.5 |
| median | 3.3 |
| Q3 | 5 |
| 95-th percentile | 12 |
| Maximum | 29 |
| Range | 27.86 |
| Interquartile range (IQR) | 2.5 |
Descriptive statistics
| Standard deviation | 3.556692858 |
|---|---|
| Coefficient of variation (CV) | 0.7819064067 |
| Kurtosis | 7.004574426 |
| Mean | 4.548745 |
| Median Absolute Deviation (MAD) | 1.15 |
| Skewness | 2.36912121 |
| Sum | 27292.47 |
| Variance | 12.65006409 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 4 | 200 | 3.3% |
| 4.5 | 195 | 3.2% |
| 5 | 194 | 3.2% |
| 3 | 179 | 3.0% |
| 8 | 174 | 2.9% |
| 3.2 | 169 | 2.8% |
| 3.5 | 165 | 2.8% |
| 6 | 161 | 2.7% |
| 3.1 | 159 | 2.6% |
| 5.5 | 159 | 2.6% |
| Other values (92) | 4245 |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 1.14 | 1 | < 0.1% |
| 1.15 | 1 | < 0.1% |
| 1.16 | 1 | < 0.1% |
| 1.2 | 2 | < 0.1% |
| 1.22 | 2 | < 0.1% |
| 1.25 | 12 | |
| 1.27 | 1 | < 0.1% |
| 1.28 | 5 | |
| 1.29 | 4 | 0.1% |
| 1.3 | 9 |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 29 | 3 | 0.1% |
| 26 | 9 | 0.1% |
| 23 | 4 | 0.1% |
| 21 | 28 | 0.5% |
| 19 | 22 | 0.4% |
| 17 | 37 | 0.6% |
| 15 | 103 | |
| 13 | 55 | |
| 12 | 94 | |
| 11 | 103 |
Spearman's ρ
Spearman's rank correlation coefficient (ρ) measures monotonic correlation between two variables and is therefore better than Pearson's r at catching nonlinear monotonic relationships. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
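The definition above can be sketched in plain Python: rank both variables, then apply the Pearson formula to the ranks. The helper names are illustrative, ties receive the average of their positions (a common convention, assumed here), and the inputs are the B365H and B365A values from rows 0 to 4 of the "First rows" table. This is not the code the report itself ran.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Covariance divided by the product of the standard deviations."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the rank variables."""
    return pearson(average_ranks(x), average_ranks(y))


b365h = [1.727, 2.800, 2.250, 1.727, 1.667]
b365a = [4.333, 2.200, 2.750, 4.333, 4.500]
rho = spearman(b365h, b365a)
print(round(rho, 6))  # -1.0
```

On these five rows the home and away odds are ranked in exactly opposite order, so ρ reaches -1; the full 6000-row dataset will generally give a different value.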
Pearson's r
Pearson's correlation coefficient (r) measures linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
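A short sketch of this formula, together with a check of the invariance property, might look as follows. The function name is illustrative, and the inputs are the B365H and B365A values from rows 0 to 4 of the "First rows" table; the shift and scale constants are arbitrary choices for the demonstration.

```python
def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = (sum((a - mean_x) ** 2 for a in x) / n) ** 0.5
    std_y = (sum((b - mean_y) ** 2 for b in y) / n) ** 0.5
    return cov / (std_x * std_y)


x = [1.727, 2.800, 2.250, 1.727, 1.667]
y = [4.333, 2.200, 2.750, 4.333, 4.500]
r = pearson_r(x, y)

# Shift and rescale each variable separately (positive scale factors):
# the coefficient should not change.
r_transformed = pearson_r([10 * a + 3 for a in x], [0.5 * b - 1 for b in y])

print(abs(r - r_transformed) < 1e-9)  # True
```

A negative scale factor on one variable would flip the sign of r, which is why the demonstration uses positive factors only.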
Kendall's τ
Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation. To calculate τ for two variables X and Y, one counts the concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
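Taken literally, this description corresponds to the variant often called tau-a, which can be sketched as below. Library implementations commonly apply a tie correction (tau-b), so values computed by the profiling tool may differ when ties are present. The function name is illustrative, and the inputs are the IWH and IWA values from rows 0 to 4 of the "First rows" table.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs, with no tie correction."""
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1    # both variables move the same way
        elif direction < 0:
            discordant += 1    # the variables move in opposite ways
        # direction == 0 means a tie; it counts toward neither group
    total_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / total_pairs


iwh = [1.80, 2.90, 2.30, 1.80, 1.70]
iwa = [3.8, 2.2, 2.7, 3.8, 4.2]
tau = kendall_tau_a(iwh, iwa)
print(tau)  # -0.9
```

Of the 10 pairs, 9 are discordant and one (rows 0 and 3) is tied in both variables, giving (0 - 9) / 10 = -0.9.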
Phik (φk)
Phik (φk) is a practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution.
A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
First rows
| | FTR | B365H | B365D | B365A | IWH | IWD | IWA | LBH | LBD | LBA | WHH | WHD | WHA |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | NH | 1.727 | 3.25 | 4.333 | 1.80 | 3.1 | 3.8 | 1.615 | 3.25 | 5.00 | 1.66 | 3.3 | 4.50 |
| 1 | NH | 2.800 | 3.25 | 2.200 | 2.90 | 3.0 | 2.2 | 2.800 | 3.20 | 2.20 | 2.75 | 3.1 | 2.30 |
| 2 | NH | 2.250 | 3.25 | 2.750 | 2.30 | 3.0 | 2.7 | 2.250 | 3.20 | 2.75 | 2.30 | 3.1 | 2.75 |
| 3 | H | 1.727 | 3.25 | 4.333 | 1.80 | 3.1 | 3.8 | 1.833 | 3.20 | 3.75 | 1.72 | 3.2 | 4.33 |
| 4 | H | 1.667 | 3.40 | 4.500 | 1.70 | 3.2 | 4.2 | 1.615 | 3.50 | 4.50 | 1.66 | 3.3 | 4.50 |
| 5 | H | 1.200 | 5.00 | 12.000 | 1.20 | 5.0 | 10.0 | 1.200 | 5.00 | 11.00 | 1.20 | 5.0 | 11.00 |
| 6 | H | 1.222 | 5.00 | 10.000 | 1.25 | 4.5 | 9.0 | 1.250 | 4.50 | 10.00 | 1.22 | 5.0 | 9.50 |
| 7 | NH | 3.000 | 3.25 | 2.100 | 2.90 | 3.0 | 2.2 | 3.000 | 3.20 | 2.10 | 3.20 | 3.0 | 2.10 |
| 8 | H | 1.571 | 3.40 | 5.500 | 1.65 | 3.3 | 4.4 | 1.615 | 3.50 | 4.50 | 1.57 | 3.5 | 5.00 |
| 9 | NH | 2.750 | 3.25 | 2.250 | 2.50 | 3.0 | 2.5 | 2.750 | 3.20 | 2.25 | 2.70 | 3.0 | 2.37 |
Last rows
| | FTR | B365H | B365D | B365A | IWH | IWD | IWA | LBH | LBD | LBA | WHH | WHD | WHA |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 5990 | H | 2.62 | 3.2 | 3.00 | 2.60 | 3.10 | 2.95 | 2.72 | 3.15 | 2.99 | 2.62 | 3.10 | 2.88 |
| 5991 | NH | 2.90 | 3.6 | 2.50 | 2.85 | 3.55 | 2.40 | 3.03 | 3.59 | 2.44 | 2.90 | 3.50 | 2.38 |
| 5992 | NH | 2.60 | 3.1 | 3.10 | 2.50 | 3.20 | 2.95 | 2.73 | 3.05 | 3.08 | 2.62 | 3.00 | 3.00 |
| 5993 | H | 1.22 | 7.5 | 13.00 | 1.22 | 6.50 | 12.50 | 1.22 | 7.01 | 13.26 | 1.20 | 7.00 | 13.00 |
| 5994 | H | 1.18 | 8.0 | 19.00 | 1.20 | 7.00 | 13.00 | 1.18 | 7.74 | 18.36 | 1.15 | 8.00 | 15.00 |
| 5995 | H | 1.64 | 3.9 | 6.00 | 1.70 | 3.80 | 5.00 | 1.63 | 3.99 | 6.37 | 1.60 | 3.90 | 6.00 |
| 5996 | H | 1.95 | 3.6 | 4.33 | 2.00 | 3.40 | 3.80 | 2.02 | 3.48 | 4.20 | 1.95 | 3.50 | 4.00 |
| 5997 | NH | 7.50 | 4.5 | 1.50 | 7.20 | 4.40 | 1.45 | 7.64 | 4.37 | 1.51 | 7.00 | 4.33 | 1.47 |
| 5998 | H | 1.57 | 4.5 | 6.00 | 1.65 | 4.00 | 5.10 | 1.53 | 4.51 | 6.76 | 1.52 | 4.33 | 6.00 |
| 5999 | H | 4.10 | 3.9 | 1.90 | 3.75 | 3.80 | 1.90 | 3.87 | 3.90 | 1.98 | 3.90 | 3.75 | 1.91 |
Most frequently occurring
| | FTR | B365H | B365D | B365A | IWH | IWD | IWA | LBH | LBD | LBA | WHH | WHD | WHA | # duplicates |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | H | 1.20 | 6.50 | 15.00 | 1.20 | 5.20 | 11.0 | 1.20 | 6.00 | 15.00 | 1.20 | 6.0 | 15.0 | 2 |
| 1 | H | 1.20 | 7.50 | 17.00 | 1.22 | 5.50 | 12.0 | 1.22 | 6.00 | 13.00 | 1.22 | 5.5 | 15.0 | 2 |
| 2 | H | 1.29 | 5.50 | 11.00 | 1.30 | 4.80 | 8.5 | 1.29 | 5.50 | 10.00 | 1.29 | 5.5 | 10.0 | 2 |
| 3 | H | 1.40 | 4.50 | 8.50 | 1.35 | 4.50 | 7.3 | 1.40 | 4.50 | 8.00 | 1.44 | 4.0 | 8.0 | 2 |
| 4 | H | 1.85 | 3.60 | 4.75 | 1.90 | 3.45 | 3.8 | 1.90 | 3.50 | 4.33 | 1.91 | 3.1 | 4.5 | 2 |
| 5 | H | 1.90 | 3.10 | 4.33 | 1.90 | 3.10 | 3.5 | 1.83 | 3.20 | 3.75 | 1.90 | 3.2 | 3.5 | 2 |
| 6 | H | 2.40 | 3.25 | 3.00 | 2.50 | 3.20 | 2.6 | 2.38 | 3.25 | 3.00 | 2.40 | 3.2 | 3.0 | 2 |
| 7 | NH | 1.44 | 4.00 | 7.50 | 1.45 | 3.80 | 5.4 | 1.44 | 3.60 | 6.50 | 1.50 | 3.4 | 6.0 | 2 |
| 8 | NH | 1.44 | 4.20 | 8.00 | 1.40 | 4.30 | 7.5 | 1.40 | 4.50 | 8.00 | 1.44 | 4.0 | 8.0 | 2 |
| 9 | NH | 1.83 | 3.50 | 4.50 | 1.85 | 3.40 | 3.8 | 1.80 | 3.50 | 4.50 | 1.83 | 3.4 | 4.5 | 2 |
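The "# duplicates" column appears to give the number of times each fully identical row occurs. A minimal sketch of that kind of count, using rows 0 to 4 of the "First rows" table restricted to the B365H and B365D columns; the restriction to two columns is purely illustrative, since the report compares all 13 columns:

```python
from collections import Counter

# (B365H, B365D) for rows 0-4 of the "First rows" table.
rows = [
    (1.727, 3.25),
    (2.800, 3.25),
    (2.250, 3.25),
    (1.727, 3.25),
    (1.667, 3.40),
]

# Count how often each distinct row occurs, then keep the repeated ones.
occurrences = Counter(rows)
duplicated = {row: n for row, n in occurrences.items() if n > 1}

print(duplicated)  # {(1.727, 3.25): 2}
```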